Over the past two decades, school shootings within the United States have repeatedly devastated 
communities and shaken public opinion. Many of these attacks appear to be ‘lone wolf’ ones driven 
by specific individual motivations, and the identification of precursor signals and hence actionable 
policy measures would thus seem highly unlikely. Here, we take a system-wide view and investigate 
the timing of school attacks and the dynamical feedback with social media. We identify a trend 
divergence in which college attacks have continued to accelerate over the last 25 years while those 
carried out on K-12 schools have slowed down. We establish the copycat effect in school shootings 
and uncover a statistical association between social media chatter and the probability of an attack in 
the following days. While hinting at causality, this relationship may also help mitigate the frequency 
and intensity of future attacks. 


Extensive research has been carried out on individual mass shooting case studies, yielding a
complex variety of causes revolving around individual-centric factors such as mental illness,
social rejection and harassment (see [5] and [6] for reviews). A sociological model to understand
and prevent attacks has been proposed [7], and several solutions have been presented, including
community cohesion [7] and early-signal detection [9]. Our work advances current understanding
by providing a collective-level description beyond individual case studies, accompanied by a
rigorous mathematical framework. These results follow from our unified treatment of two
complementary databases (see Methods): the first (Shultz) includes fatal attacks from 1990 to
November 2014, while the second (Everytown) includes all incidents from 2013 to November 2014,
irrespective of whether there were casualties. Furthermore, we include a database of mass
killings (collected by USA Today), covering attacks with four or more victims from 2006 to
July 2015, showing that the results are not exclusive to school shootings but are consistent
across high-profile types of violence.

Data characterization: Many human activities have been shown to give rise to heavy-tailed
distributions in the magnitude of the associated events and in the interevent times. Consistent
with other human activities, we found heavy-tailed distributions in the attack size (Fig. 1A)
and the timing of attacks (Fig. 1B) across the three databases studied. Although these data
reflect attacks with different characteristics, all databases showed remarkable consistency in
the interattack distribution when normalized by the average waiting time (Fig. 1B). Importantly,
heavy-tailed distributions in the timing of attacks signal a deviation from a random Poisson
process, in which the event rate is uniform in time, and indicate the presence of underlying
factors.
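
As an illustration of the normalization described above, the following minimal sketch computes
interevent times from a list of event dates, rescales them by the mean waiting time, and builds
an empirical CCDF. The function names and the toy dates are hypothetical and are not taken from
any of the databases; the exponential curve is included only as the reference that a Poisson
process with unit mean waiting time would follow.

```python
import math
from datetime import date


def interevent_days(dates):
    """Return waiting times (in days) between consecutive events."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def normalized_ccdf(waits):
    """Rescale waits by their mean and return (x, P(X >= x)) pairs."""
    mean_wait = sum(waits) / len(waits)
    xs = sorted(w / mean_wait for w in waits)
    n = len(xs)
    # The i-th smallest value (0-based) has n - i observations at or above it.
    return [(x, (n - i) / n) for i, x in enumerate(xs)]


# Hypothetical toy dates, for illustration only (not real data).
toy_dates = [date(2014, 1, 1), date(2014, 1, 3), date(2014, 1, 13), date(2014, 3, 4)]
waits = interevent_days(toy_dates)
ccdf = normalized_ccdf(waits)

# Reference curve: a Poisson process with unit mean wait has CCDF exp(-x).
poisson_reference = [(x, math.exp(-x)) for x, _ in ccdf]
```

A heavy tail would show up as empirical CCDF values lying well above the exponential reference
at large normalized waiting times.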

The deviation from Poisson processes in complex systems has been associated with burstiness,
where events cluster together in time (Figs. S1B-D). Clustering can emerge from two mechanisms
[10]. First, it is related to the distribution of interevent times and can be characterized by
the normalized coefficient of variation
$B = (\sigma_\tau/\mu_\tau - 1)/(\sigma_\tau/\mu_\tau + 1)$, where $\mu_\tau$ and $\sigma_\tau$
are the mean and standard deviation of the interevent times. $B$ ranges from $-1$ for highly
regular processes, through 0 for Poisson processes, to 1 for heavy-tailed distributions.
Physiological complex systems such as heartbeats are highly regular, while natural and human
activities usually exhibit large burstiness values. Interestingly, the distribution of
interevent times is only moderately skewed, with $B = 0.155$, $0.122$ and $0.004$ for the
Shultz, Everytown and USA Today databases, respectively. These values contrast with the
burstiness of other human activities such as emailing, library loans, printing and calls,
which ranges between 0.2 and 0.65. The second mechanism affecting clustering is the memory of
the system. While natural activities exhibit memory (e.g. large aftershocks follow large
earthquakes), human activities have low to no memory [10]. We measured the memory of the
system using the autocorrelation, which ranges from $-1$ for disassortative processes (i.e.
large (small) interevent times follow small (large) interevent times), through 0 for no
correlation, to 1 for assortative processes. In contrast to other human activities, we found
memory comparable to
FIG. 1. Escalation patterns in school shootings. (A) Complementary Cumulative Distribution
Function (CCDF) for event severity (dots and solid line) and best fit (dashed line) to a
power-law distribution. Note that the USA Today database only includes attacks with four or
more victims. (B) CCDF for normalized interevent times (dots and solid line). Inset shows the
CCDF of the raw interevent times. (C) Probability of attack depending on the presence of an
attack in the previous seven days. Each bin contains one sixth of the attacks. (D-F) The
escalation plot, $\log_{10} n$ vs. $\log_{10} \tau_n$, for (D) All, (E) College and (F) K-12
attacks using the Shultz database (Methods). LOWESS fit ($\delta = 0$, $\alpha = 0.66$) is
shown in dark gray, with the years where the trend changes annotated.


natural phenomena for up to five attacks (Fig. S1A). Importantly, the existence of memory is
linked to a fourfold increase in the probability of an attack in the days following a school
shooting (Fig. 1C). Given that the clustering arises not only from a skewed distribution of
the interevent times but also from memory, we hypothesize the existence of an external feedback
loop that increases the attack rate, which we later link to social media.
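
The two clustering measures discussed here can be sketched in a few lines. This is a minimal
illustration, assuming the standard burstiness definition $B = (\sigma/\mu - 1)/(\sigma/\mu + 1)$
and the autocorrelation of the standardized interevent series used in Fig. S1; the function
names are hypothetical and the input series are toy values, not data from the study.

```python
import statistics


def burstiness(waits):
    """B = (cv - 1) / (cv + 1), with cv = sigma / mu of the interevent times."""
    mu = statistics.fmean(waits)
    sigma = statistics.pstdev(waits)
    cv = sigma / mu
    return (cv - 1) / (cv + 1)


def memory(waits, lag=1):
    """Autocorrelation of the standardized interevent series at a given lag."""
    mu = statistics.fmean(waits)
    sigma = statistics.pstdev(waits)  # must be non-zero for this to be defined
    z = [(w - mu) / sigma for w in waits]
    n = len(z) - lag
    return sum(z[t + lag] * z[t] for t in range(n)) / n


# Toy series (hypothetical): perfectly regular, and strictly alternating.
regular = [5, 5, 5, 5]
alternating = [1, 9, 1, 9, 1, 9]

b_regular = burstiness(regular)          # regular process: B = -1
m_alternating = memory(alternating)      # short/long alternation: memory = -1
```

The regular series sits at the lower bound of $B$, and the alternating series is fully
disassortative, which matches the interpretation of the two ranges given in the text.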


To further characterize the data, we analyzed the interevent time distribution in detail. We
applied locally weighted scatterplot smoothing (LOWESS) to the log-log plot of $\tau_n$ versus
$n$ (Fig. 1D-F), where the slope $b$ is an indicator of changes in the attack rate. Figure 1D,
containing all attacks, shows three regions in time. From 1990 until 1993, the attack rate
increased steadily ($b > 0$). From 1993 until 2003 the attacks slowed down ($b < 0$); this
slowdown was interrupted around 2003, when the escalation rate again became positive. Although
the specific value of $b$ depends on the correct determination of the first interevent time
($\tau_0$), the results are robust to different values of $\tau_0$ (Fig. S2B-C). The change in
trend in 2003 shows that college attacks have been accelerating (Fig. 1E), while K-12 attacks
have continued slowing down (Fig. 1F). In the following sections we describe and analyze the
results of two models that have been successfully applied to explain other forms of conflict:
the Hawkes process [12-14] and the dynamical Red Queen or “Red versus Blue” model.

Models: Hawkes process: The Hawkes process is a self-exciting point process model described by

$$\lambda(x, t) = \mu + \sum_{i:\, t_i < t} g(x - x_i, t - t_i), \tag{1}$$

where $\lambda(x, t)$ is the attack rate at position $x$ and time $t$, $\mu$ is the background
Poisson rate and $g(x - x_i, t - t_i)$ is the contribution of the attack $i$ occurring at
$(x_i, t_i)$. Hawkes processes have typically been used to study earthquakes. In the case of
seismicity, a triggered earthquake is followed by aftershocks, which in turn activate new
aftershocks, creating a cascade of events. This is modelled by separating earthquakes into
background events and aftershocks, where background events occur at a specific background rate
and the probability of an aftershock depends on the time and distance from previous earthquakes
according to the kernel $g(x, t)$. The kernel can be explicitly defined or calculated using
FIG. 2. Hawkes process model: Distance and time terms of the kernel function. (A) Fraction of
attacks within a distance of each other and two null models where the attacks are drawn with
probability proportional to the underlying US population and at times equal to those in the
Shultz database (null model Shultz) or with frequency following a Poisson process (null model
Poisson). The two null models overlap in this plot. (B) Intensity of attacks with respect to
distance between attacks. A line with slope equal to 2 (i.e. the bare intensity decrease
matching the increase in area) is shown for comparison. (C) Intensity of attacks with respect
to time between attacks.


non-parametric methods [16]. The same modelling has also recently been successfully applied to
social phenomena, such as finance [17], crime [18] and terrorism in Iraq [19]. Here, we used
the non-parametric method from [16] to estimate the kernel $g(x, t)$ and understand the
mechanism by which school shootings trigger cascades of attacks.
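
To make the structure of Eq. (1) concrete, the sketch below evaluates a purely temporal Hawkes
rate: a background term plus the summed contributions of all earlier events. It is a
simplification under stated assumptions, not the estimation procedure used in this work: space
is ignored, the kernel is given an assumed power-law form $g(\Delta t) = k/(\Delta t + c)^p$,
and the values of `k`, `c`, `p` and `mu` are illustrative placeholders rather than fitted
quantities.

```python
def power_law_kernel(dt, k=0.05, c=1.0, p=1.5):
    """Assumed triggering kernel g(dt) = k / (dt + c)**p (illustrative parameters)."""
    return k / (dt + c) ** p


def hawkes_intensity(t, event_times, mu=1 / 37.5, kernel=power_law_kernel):
    """Temporal Hawkes rate: background mu plus contributions of earlier events."""
    return mu + sum(kernel(t - t_i) for t_i in event_times if t_i < t)


# Hypothetical event days, for illustration only.
events = [0.0, 3.0, 40.0]

rate_before_any = hawkes_intensity(0.0, events)   # no earlier events: background only
rate_quiet = hawkes_intensity(30.0, events)       # long after the first two events
rate_after = hawkes_intensity(41.0, events)       # one day after the third event
```

With a decaying kernel, the rate is highest shortly after an event and relaxes toward the
background level, which is the self-exciting behaviour the model is meant to capture.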


Marsan and Lengline’s method [16] uses an expectation-maximization algorithm on the binned
events (earthquakes in their case). It iteratively decouples the events into background and
triggered events using $\mu$ and $g$, and updates $\mu$ and $g$ using the new decoupling of
events until convergence is obtained. The original algorithm revealed a linear scaling between
the magnitude of the event and the probability of an aftershock. However, this either does not
apply to school shootings or the difference is too small to quantify given the sparsity of our
data (Fig. S3A-B). Therefore, we excluded the magnitude of the attack from the study and
calculated the probability of new attacks given the time and distance since previous attacks.
To put our results in perspective, we created two null models in which the attacks were drawn
at random from US cities with probability proportional to their population (using the Geonames
database). The first model uses the timing from the Shultz database, whereas in the second
model the attacks occur as a Poisson process whose mean interevent time equals that of the
Shultz database, $\bar{\tau}_{\mathrm{Shultz}} = 37.5$ days.
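
The Poisson null model can be sketched as follows: event times are generated with exponential
waiting times of mean 37.5 days, and locations are drawn with probability proportional to
population. This is a minimal illustration; the function names and the three-city table are
hypothetical stand-ins (the study draws from the Geonames database), and the random seed is
fixed only to make the example reproducible.

```python
import random


def poisson_attack_times(n_events, mean_wait_days=37.5, rng=None):
    """Event times of a homogeneous Poisson process with the given mean wait."""
    rng = rng or random.Random()
    t, times = 0.0, []
    for _ in range(n_events):
        t += rng.expovariate(1.0 / mean_wait_days)  # exponential waiting time
        times.append(t)
    return times


def draw_cities(cities, n_events, rng=None):
    """Sample city names with probability proportional to population."""
    rng = rng or random.Random()
    names = [name for name, _ in cities]
    weights = [population for _, population in cities]
    return rng.choices(names, weights=weights, k=n_events)


# Hypothetical city table, for illustration only.
toy_cities = [("A", 8_000_000), ("B", 2_000_000), ("C", 500_000)]

rng = random.Random(0)
times = poisson_attack_times(1000, rng=rng)
places = draw_cities(toy_cities, 1000, rng=rng)
```

For the first null model, the simulated `times` would simply be replaced by the observed attack
times while keeping the population-weighted locations.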


First, we analyzed the fraction of pairs of attacks that are located within a specific distance
of each other (Fig. 2A). The distance between all attack pairs is similar to that expected if
the attacks were distributed in proportion to the US population. Next, we analyzed the effect
of time and distance on the spreading of attacks. Figures 2B-C show the two components of the
kernel function $g$: the intensity decrease as a function of the distance between attacks
(Fig. 2B) and the decrease as a function of time between attacks (Fig. 2C). If the attacks were
uniformly distributed, the algorithm would assign a low weight to the kernel function
(Fig. S4A-C). However, we obtained a consistent form of the kernel function for all three
databases studied. The probability of new attacks decays approximately as a power law with both
the distance and the time since previous attacks. Moreover, although the consistency in
distance can be explained by the underlying distribution of population (Fig. 2B), the
consistency in timing cannot be explained by an underlying Poisson process (Fig. 2C). Thus our
results indicate that while the attacks occur approximately at random in space, with the
exception of within-town attacks, they do affect the timing of new shootings, increasing the
rate of attacks 3- to 10-fold, especially in the ten days following an attack. The Hawkes model
confirms the presence of attack cascades, and quantifies the effect of distance and time on the
probability of new attacks.

Red versus Blue model: Empirical and theoretical studies have shown that the trend in timings
and the distribution of severities of attacks in human conflicts are described by the power
laws $\tau_n = \tau_1 n^{-b}$ and $p(s) \propto s^{-\alpha}$ respectively, where $\tau_n$ is
the time between attacks $n$ and $n + 1$, $b$ is the escalation rate, $s$ is the attack
severity, and $\alpha \approx 2.5$ [20].

FIG. 3. Red vs Blue model: Attack characteristics. (A) Histogram of $\sum_j \epsilon_j$. Early
and Late attacks are marked in blue and orange respectively. (F) Prediction plot,
$\log_{10} \tau_1$ vs. $b$. All states with at least five events are considered. States above
the $b = 0$ line experienced an escalation in the number of attacks. (B-E) Attack
characteristics for All (Grey), Early (Blue) and Late (Orange) attacks. (B) Number of tweets
preceding the attacks. (C) Average casualty number. (D) Fraction of attacks with victims.
(E) Fraction of attacks ending in suicide. The updated Shultz et al. database was used for all
plots unless otherwise noted.

Positive values of the escalation rate $b$ reflect an increase in the frequency of attacks
with time, while the attack rate decreases if $b$ is negative. An explanatory model emerges
from consideration of the confrontation dynamics between two opponents [20]. In our case, the
two ‘opponents’ are the pool of potential attackers, which we call Red, none of whom are
necessarily in contact with or know each other, and society, which we call Blue. At any one
instant, Red tends to hold a collective advantage $R$ over Blue in that Red is largely an
unknown threat group residing within Blue. The size of this advantage depends on the number of
potential attackers and their resources. Each attack can affect the balance between Red and
Blue, for example by increasing $R$. It is reasonable to assume that the main changes in Red’s
lead $R$ over Blue occur just after a new attack, e.g. due to media coverage. This is supported
empirically by the increased probability of a subsequent attack (Fig. 1C), as well as by the
results of the Hawkes model (Fig. 2C). If the changes in $R$ are independent and identically
distributed, the Central Limit Theorem states that the typical value of $R$ after $n$ attacks,
$R(n)$, will be proportional to $n^b$, where $b = 0.5$ [21]. For the more general case where
changes in $R$ depend on the history of previous changes, $b$ will deviate from 0.5,
corresponding to ‘anomalous’ diffusion [22]. Taking the frequency of the attacks to be
proportional to Red’s advantage over Blue, so that $\tau_n \propto 1/R(n)$, we obtain
$\tau_n = \tau_1 n^{-b}$.

Our theory predicts that the time to the $n$th attack is determined by the progress curve
$\tau_n = \tau_1 n^{-b}$. The progress curve assumes that the time to the next attack is
deterministic. However, in reality one can imagine that a series of $N$ background processes
would need to ‘fall into place’ before a potential attacker finds himself in an operational
position to carry out an attack and hence provide the $(n + 1)$th attack. The triggering of
each of these $N$ processes may independently fluctuate and so delay or accelerate the next
attack. Similar to multiplicative degradation processes in engineering, we assume that each of
these steps multiplies the expected time interval by a factor $(1 + \epsilon_j)$, where the
stochastic variables $\epsilon_j$ mimic these exogenous factors. It is reasonable to assume
that the values of the $\epsilon_j$ are independent and identically distributed, which means
that their sum (i.e., the noise term in the progress curve fit) is approximately Gaussian
distributed with zero mean (Fig. 3A). The observed time interval is now given by
$\tau_n = \tau_1 n^{-b} \prod_{j=1}^{N} (1 + \epsilon_j)$. It
FIG. 4. Feedback loop between school shootings and mass media. (A) Time series of the number of tweets containing 
“school” and “shooting” (Red lines, left axis), and the severity of attacks (right axis) for Early attacks (Blue), Late attacks 
(Orange) and the rest (Grey). (B) Sandy Hook incident. (C) Probability of an attack happening in the 7, 18 or 45 days 
following attack n, as a function of the mean number of tweets with the words “school” and “shooting” at days n and n + 1. 
The Shultz database was used for all plots. 


then follows that $\log \tau_n = \log \tau_1 - b \log n + \sum_j \epsilon_j$, since
$\log (1 + \epsilon_j) \approx \epsilon_j$ if $|\epsilon_j| \ll 1$. Hence the progress curve
represents a straight-line fit through a maximum likelihood approach on a log-log plot, exactly
as assumed by our LOWESS analysis (Fig. 1D-F), where residuals are Gaussian distributed. The
attacks whose $\sum_j \epsilon_j$ deviates from zero are likely to have distinctive
characteristics. We labeled the attacks where $\sum_j \epsilon_j$ is more than one standard
deviation above zero as Late and those where it is more than one standard deviation below zero
as Early (Fig. 3A). We found that Early attacks are correlated with high media activity
(Fig. 3B), as expected since those attacks take place while the news about the previous one
has not yet faded. We also observed that Late attacks are both more deadly (Fig. 3C-D) and
result more frequently in the suicide of the attacker than Early attacks (Fig. 3E). We identify
Late attacks with planned attacks whose attackers provide a continued leakage of clues over
time [8].
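
The labeling step can be sketched as follows. Note the substitution: instead of the LOWESS
smoothing used for the figures, this minimal example fits a single straight line to
$\log_{10} \tau_n$ versus $\log_{10} n$ by ordinary least squares, then labels each interval by
whether its residual lies more than one standard deviation above or below zero. The function
names, the "Typical" label and the synthetic series are assumptions made for illustration.

```python
import math
import statistics


def fit_progress_curve(waits):
    """Least-squares fit of log10(tau_n) = log10(tau_1) - b * log10(n).

    Returns (log10_tau1, b, residuals); all waits must be positive.
    """
    xs = [math.log10(n) for n in range(1, len(waits) + 1)]
    ys = [math.log10(w) for w in waits]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, -slope, residuals


def label_attacks(residuals):
    """Label each interval's closing attack as Early, Late or Typical."""
    sd = statistics.pstdev(residuals)
    labels = []
    for r in residuals:
        if r > sd:
            labels.append("Late")      # waited longer than the trend predicts
        elif r < -sd:
            labels.append("Early")     # arrived sooner than the trend predicts
        else:
            labels.append("Typical")
    return labels


# Synthetic, noise-free series tau_n = 100 * n**-0.5 (illustrative only).
synthetic_waits = [100 * n ** -0.5 for n in range(1, 21)]
log10_tau1, b, residuals = fit_progress_curve(synthetic_waits)
example_labels = label_attacks([0.0, 2.0, -2.0, 0.1, -0.1])
```

On the noise-free synthetic series the fit recovers $b = 0.5$ and $\log_{10} \tau_1 = 2$; with
real interevent times the residuals play the role of $\sum_j \epsilon_j$.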

The Red versus Blue model uncovers an unexpected inter-relationship between the patterns of
lone-wolf school attacks in different geographical locations. If events in different locations
were independent, one would not expect any relationship between $\log \tau_1$ and $b$ in
different locations. However, Figure 3F shows that the opposite is true. The presence of a
linear relationship among these different states, as well as the presence of a significant
kernel function in the Hawkes model, indicates that there is a common dynamical factor
influencing otherwise independent attackers across different states. Our analysis suggests that
the cause of this common dynamical factor lies in modern media sources.

The copycat effect: Our hypothesis that the interaction between attacks is indirect, mediated
by the media, corresponds to a phenomenon commonly known as the copycat effect [23]. This
interaction can be attributed to an acute ‘issue-attention cycle’ [24], with the media reacting
strongly to every attack [25]. Although the effect of mass media has been studied, evidence of
copycats has been anecdotal [6]. To analyze the role of social media (which echoes and
amplifies all media), we obtained 72 million tweets containing the word “shooting”. Of these,
over 1.1 million tweets contained the word “school”. Figures 4A-B visualize the relationship
between the number of tweets containing the words “school” and “shooting” and the Early and
Late attacks. As expected given that a peak in Twitter activity follows every attack, Early
attacks are correlated with periods of high Twitter activity. To study the interaction between
social media and school shootings, we plotted the average number of tweets containing the words
“school” and “shooting” against the probability of an attack in the next 7, 17 and 44 days,
corresponding to the 25th, 50th and 75th percentiles of the distribution of the days between
attacks. Fig. 4C shows that the probability of an attack increases with the number of tweets
talking about school shootings. For example, the probability of an attack in the next week
doubles when the number of school shooting tweets increases from 10 to 50 tweets/million. By
contrast, tweets containing only “shooting” or “mass” and “murder” did not show a pronounced
effect (Fig. S5). Our analysis thus confirms that social media publicity about school shootings
correlates with an increase in the probability of new attacks.
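
One way to tabulate this relationship is sketched below. It assumes that "days n and n + 1"
refers to the day of an attack and the following day, averages the tweet rate over those two
days, sorts attacks by that average, splits them into equal-count bins, and reports the
fraction of attacks in each bin that were followed by another attack within the chosen window.
The function name, the bin count and all numbers are hypothetical and do not reproduce any
result from the study.

```python
def followup_probability(attack_days, tweets_per_million, window=7, n_bins=2):
    """Return (mean tweet rate, fraction followed within `window` days) per bin.

    Assumes at least `n_bins` consecutive attack pairs and that the tweet
    rate is available for each attack day and the day after it.
    """
    days = sorted(attack_days)
    samples = []
    for day, next_day in zip(days, days[1:]):
        chatter = (tweets_per_million[day] + tweets_per_million[day + 1]) / 2
        samples.append((chatter, next_day - day <= window))
    samples.sort(key=lambda sample: sample[0])

    size = len(samples) // n_bins
    bins = []
    for i in range(n_bins):
        stop = (i + 1) * size if i < n_bins - 1 else len(samples)
        chunk = samples[i * size:stop]
        mean_chatter = sum(c for c, _ in chunk) / len(chunk)
        p_follow = sum(1 for _, followed in chunk if followed) / len(chunk)
        bins.append((mean_chatter, p_follow))
    return bins


# Hypothetical toy inputs, for illustration only.
toy_days = [0, 5, 9, 60, 100, 160]
toy_tweets = {0: 50, 1: 50, 5: 60, 6: 40, 9: 40, 10: 40,
              60: 10, 61: 10, 100: 12, 101: 8}
bins = followup_probability(toy_days, toy_tweets)
```

Changing `window` to the other percentiles of the interattack distribution gives the remaining
curves; such a tabulation shows an association only and says nothing about its direction.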


Our mathematical theory explains and predicts the probabilistic escalation patterns in school
shootings. Our theory is supported by analysis of an FBI dataset of active shootings [26]
(Figs. S6 and S7, Supplementary Information), and has implications for attack prevention and
mitigation. First, the discovery of distinct trends for college and K-12 attacks should
motivate policy makers to focus policy efforts in distinct ways for these two educational
settings. Second, the presence of underlying patterns in the data can improve both short-term
and long-term prediction of future trends, for example by focusing efforts on the cities where
there has already been an attack. Finally, our analysis establishes for the first time the
copycat effect in school shootings, a topic which has been analyzed primarily in a narrative,
case-by-case way to date. Our results do not contradict the fact that the psychological aspect
of the attacker is a key factor in an individual attack, or that traditional prevention methods
work, but instead draw a new collective example of human conflict in which a small, dynamical,
violent sector of society confronts the remainder fueled by Blue’s own informational product
(media).
METHODS

Databases: We studied the following datasets. Everytown: The attacks, with and without victims,
were extracted from http://everytown.org/, containing all incidents from the period January
2013 to November 2014. Shultz: The database for the period 1990-2013 gathered by Shultz et al.
was updated with the Everytown database to include recent attacks with victims up to November
2014. USA Today: The database for the period 2006 to July 2015 gathered by
http://www.usatoday.com/, including all attacks with four or more victims. Active shootings:
The date, size, age of the attacker and suicide outcome were obtained from the 2014 FBI report
A Study of Active Shooter Incidents, 2000-2013 [26]. Twitter: 57 billion tweets from the
period 2010 to November 2014 were analyzed, extracting over 72 million tweets with the word
“shooting”, 1.1 million with the words “shooting” and “school”, and 233 thousand with the words
“mass” and “murder”.

Active shootings: We repeated the analysis with the 160 active shooting events from the FBI
database [26]. In this case, the distribution of attack sizes does not follow a power law
(Fig. S6A). However, this is likely due to the definition of active shooting, under which
attacks with a low number of casualties tend not to be included in the study. In agreement with
the results of the report [26], we find a steady rise in the frequency of attacks (Fig. S6B).
Consistent with our results for school shootings, the time between the first two attacks is a
good indicator of the subsequent escalation pattern (Fig. S6C). We found an interaction between
attacks (Fig. S6D), which can be attributed to the copycat effect, since the probability of an
attack in the subsequent 8, 19 and 35 days is correlated with the number of tweets containing
“shooting” (Fig. S6E), or “school” and “shooting” (Fig. S6F), but not “mass” and “murder”
(Fig. S6G). We can again define Early and Late attacks (Fig. S7A), which correlate with Twitter
activity (Fig. S7B-C). However, the size of the attacks in this case does not differ between
Early and Late attacks (Fig. S7D).

Finally, we analyzed the correlation between age, size, and suicide rates (Fig. S7E-G). We
found a positive correlation between age and attack size (Fig. S7E). Teenagers (ages 12-18)
correlate with small events (Fig. S7E) and low suicide rates (Fig. S7F). Young attackers
(ages 18-38) exhibit high suicide rates (Fig. S7F). The size of the attack is not well
correlated with suicide rates, with the exception of attacks without victims (Fig. S7G).
FIG. S1. (A) Autocorrelation ($ACF(n) = \frac{1}{N - n} \sum_{t=1}^{N-n} f(t + n) f(t)$) for
the interevent time series ($f$) for the Shultz, Everytown and USA Today databases at different
lags ($n$). The interevent time series has been normalized by subtracting the mean and dividing
by the standard deviation. (B) Attack series using the normalized interevent time. Vertical
bars correspond to individual attacks. (C) Attack series by state in the Shultz dataset.
(D) Attack series in the 8 towns with two or more attacks in the Everytown dataset.
FIG. S2. The progress plot $\log_{10} n$ vs. $\log_{10} \tau_n$, using all attacks ([0 - n]),
attacks [1 - n], [3 - n] and [10 - n] in the Shultz database.
FIG. S3. Kernel function of the Hawkes process by magnitude of attack. Intensity of attacks with respect to (left) distance 
between attacks and (right) time between attacks. 
FIG. S4. (A) Fraction of attacks within a distance of each other, two null models where the
attacks are drawn with probability proportional to the underlying US population and at times
equal to those in the Shultz database (null model Shultz) or with frequency following a Poisson
process (null model Poisson), and another null model where the attacks are drawn from the US
area at random and at times equal to those in the Shultz database. (B) Intensity of attacks
with respect to distance between attacks. (C) Intensity of attacks with respect to time between
attacks.
FIG. S5. Probability of an attack happening in the 7, 18 or 45 days following attack n, as a
function of the mean number of tweets with the words (A) “mass” and “murder” and
(B) “shooting” at days n and n + 1. The Shultz database was used for all plots.
FIG. S6. Active shooting a. (A) Complementary Cumulative Distribution Function (CCDF) for event
severity (blue dots and solid line) and best fit (dashed line) to a lognormal distribution.
(B) The progress curve, $\log_{10} n$ vs. $\log_{10} \tau_n$, for all attacks. LOWESS fit
($\delta = 0$, $\alpha = 0.66$) is shown in dark gray, with the years where the trend changes
annotated. (C) Prediction plot, $\log_{10} \tau_1$ vs. $b$. All states with more than four
events are considered. States above the $b = 0$ line experienced an escalation in the number of
attacks. (D) Probability of attack depending on the presence of an attack in the previous seven
days. Every bin contains one third of the attacks. (E-G) Probability of an attack happening in
the 8, 19 or 35 days following attack $n$, as a function of the mean number of tweets talking
about shootings at days $n$ and $n + 1$.
FIG. S7. Active shooting b. (A) Histogram showing $\sum_j \epsilon_j$. Early and Late attacks
are marked in blue and orange, respectively. (B) Time series of the number of tweets containing
“mass” and “shooting” or “murder” (Red lines, left axis), and the size of attacks (right axis)
for Early attacks (Blue), Late attacks (Orange) and the rest (Grey). (C) Median number of
tweets in the five days preceding All (Grey), Early (Blue) and Late (Orange) attacks.
(D) Average casualty number for All (Grey), Early (Blue) and Late (Orange) attacks.
(E) Probability of different magnitude of events by age group. (F) Probability of suicide by
age group. (G) Probability of suicide by size of attack group.